Semantic relation clustering for unsupervised information extraction (Regroupement sémantique de relations pour l'extraction d'information non supervisée) [in French]
Authors
Abstract
Most studies in unsupervised information extraction concentrate on relation extraction, and little work has addressed the organization of the extracted relations. In this paper we present a two-step clustering procedure for grouping semantically equivalent relations: a first step clusters relations with similar expressions, while a second step groups these initial clusters into larger semantic clusters using different semantic similarities. Our experiments show that distributional similarities are more stable than WordNet-based similarities for semantic clustering. We also demonstrate that multi-level clustering not only reduces the number of similarity computations from all relation pairs to pairs of basic clusters, but also improves the clustering results.
KEYWORDS: Unsupervised Information Extraction, Semantic Similarity, Clustering.
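The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the normalization rule, the toy context vectors in `TOY_VECTORS`, the single-link merging, and the threshold value are all assumptions chosen only to make the idea concrete. Step 1 groups relation expressions that normalize to the same string; step 2 compares only pairs of basic clusters (not all relation pairs) with a semantic similarity and merges those above the threshold.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical distributional vectors (context word -> weight), for illustration only.
TOY_VECTORS = {
    "bought": {"company": 1.0, "price": 1.0},
    "acquired": {"company": 1.0, "price": 0.5, "stake": 0.5},
    "founded": {"year": 1.0, "founder": 1.0},
}


def normalize(expression):
    """Assumed normalization: lowercase and collapse whitespace."""
    return " ".join(expression.lower().split())


def basic_clusters(relations):
    """Step 1: group relation expressions that share the same normalized form."""
    groups = defaultdict(list)
    for rel in relations:
        groups[normalize(rel)].append(rel)
    return list(groups.values())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def cluster_similarity(a, b):
    """Compare two basic clusters through the vector of their shared expression."""
    return cosine(TOY_VECTORS.get(normalize(a[0]), {}),
                  TOY_VECTORS.get(normalize(b[0]), {}))


def semantic_clusters(clusters, similarity, threshold):
    """Step 2: single-link merge of basic clusters whose similarity reaches the threshold."""
    parent = list(range(len(clusters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Only pairs of basic clusters are compared, not all pairs of relations.
    for i, j in combinations(range(len(clusters)), 2):
        if similarity(clusters[i], clusters[j]) >= threshold:
            parent[find(i)] = find(j)

    merged = defaultdict(list)
    for i, members in enumerate(clusters):
        merged[find(i)].extend(members)
    return list(merged.values())


relations = ["bought", "Bought", "acquired", "founded", "Founded "]
basic = basic_clusters(relations)
final = semantic_clusters(basic, cluster_similarity, threshold=0.5)
```

With five relation instances, step 1 yields three basic clusters, so step 2 evaluates three cluster pairs instead of the ten relation pairs a flat clustering would need. A WordNet-based measure could be passed in place of `cluster_similarity` without changing the rest of the sketch.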
Similar resources
Extraction et regroupement de relations entre entités pour l'extraction d'information non supervisée
This article is set in the context of open-domain unsupervised information extraction and focuses on the large-scale extraction and clustering of relations between named entities without defining their types a priori. The extraction step combines basic but efficient criteria with a filtering procedure based on machine learning. The clustering step organizes extracted r...
Regroupement de relations pour l'extraction d'information non supervisée
The purpose of unsupervised information extraction is to extract information from text without fixing the type of information. Our work concentrates on the task of extracting and characterizing new relations between given entity types. We first propose in this article a filtering procedure to remove false relation candidates by combining heuristics and machine learning models. Best results achi...
Unsupervised extraction of semantic relations (Extraction non supervisée de relations sémantiques lexicales) [in French]
This paper presents a knowledge base containing triples involving pairs of verbs associated with semantic or discourse relations. The relations in these triples are marked by discourse connectors between two adjacent instances of the verbs in the triple in the large French corpus, frWaC. We detail several measures that evaluate the relevance of the triples and the strength of their association....
Unsupervised selection of semantic relations for improving a distributional thesaurus (Sélection non supervisée de relations sémantiques pour améliorer un thésaurus distributionnel) [in French]
Work on distributional thesauri has shown that the relations in these thesauri are mainly reliable for high-frequency words. In this article, we propose a method for improving such a thesaurus by re-balancing it in favor of low-frequency words. This method is based on a bootstrapping mechanism: a set...
Méthode semi-compositionnelle pour l'extraction de synonymes des termes complexes
Automatic extraction of synonyms and semantically related words is a challenging task, useful in many NLP applications such as question answering, search query expansion, and text summarization. While various studies have addressed word synonym extraction, only a few have tackled the problem of acquiring synonyms of multi-word terms (MWT) from specialized corpora. To extract pai...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013